# 37 Diagonalization: Part 2 - Characterizations of diagonalization, eigenbasis, eigenspaces, algebraic and geometric multiplicities.
We are discussing diagonalizability, and in this set of notes we will develop characterizations of when a square matrix is diagonalizable.
## By definition of similarity and the factorization $PDP^{-1}$.
Firstly, by the definition, we say
> **Definition.**
> A square matrix $A$ is diagonalizable if $A$ is similar to some diagonal matrix $D$, that is, $A\stackrel{\text{sim}}\sim D$ with $D$ diagonal.
To rephrase this using the definition of similarity, this means
> **Factorization characterization.**
> A square matrix $A$ is diagonalizable if and only if there exists an invertible matrix $P$ and a diagonal matrix $D$ such that we can _factorize_ $A = P D P^{-1}$.
Of course, the challenge is producing such invertible $P$ and diagonal $D$, which we will tackle in a bit.
But first, why is a diagonalizable matrix useful?
Suppose you want to compute a large power of a matrix, say $A^{1000}$. How might we calculate this?
Here is a _naive but smart way_: One could approach this logarithmically, by first computing $A^{2}$, then its square $A^{4}$, then the square of that $A^{8}$, etc. This allows us to compute powers of $A$ whose exponents are powers of $2$ easily, namely $A^{(2^{k})}$. And since $1000 = 512 + 256 + 128 + 64 + 32 + 8$ (binary expansion), we simply have $A^{1000} = A^{512} A^{256} A^{128} A^{64} A^{32} A^{8}$. This greatly reduces the number of multiplications we need to perform.
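The repeated-squaring idea above can be sketched in a few lines. This is a hypothetical helper (the name `matrix_power_by_squaring` is mine, not a standard routine), written with numpy:

```python
import numpy as np

def matrix_power_by_squaring(A, k):
    """Compute A^k using repeated squaring: O(log k) matrix multiplications."""
    result = np.eye(A.shape[0])
    base = A.astype(float)
    while k > 0:
        if k & 1:            # current binary digit of k is 1: fold this power in
            result = result @ base
        base = base @ base   # square: A, A^2, A^4, A^8, ...
        k >>= 1
    return result
```

For $k=1000$ this uses roughly $\log_2 1000 \approx 10$ squarings plus a handful of extra multiplications, instead of 999 multiplications.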
However, if our matrix $A$ is diagonalizable, then something nice happens: We can factorize $A = PDP^{-1}$, then $$
A^{1000} = (PDP^{-1})^{1000} = (PDP^{-1})(PDP^{-1})\cdots (PDP^{-1})= PD^{1000}P^{-1},
$$because matrix multiplication is _associative_: the inner $P^{-1}P$ pairs all cancel. Effectively, we trade the problem of computing $A^{1000}$ for that of computing $D^{1000}$. And since $D$ is diagonal, its power is simple to find:
> If $D$ is a diagonal matrix with diagonal entries given by $D=\text{diag}(\lambda_{1},\lambda_{2},\ldots,\lambda_{n})$, that is, $$
D = \begin{pmatrix}\lambda_{1}\\&\lambda_{2}\\&&\ddots\\&&&\lambda_{n}\end{pmatrix}
$$ then its $k$-th power is just $D^{k}=\text{diag}(\lambda_{1}^{k},\lambda_{2}^{k},\ldots,\lambda_{n}^{k})$, or $$
D^{k} = \begin{pmatrix}\lambda_{1}^{k}\\&\lambda_{2}^{k}\\&&\ddots\\&&&\lambda_{n}^{k}\end{pmatrix}
$$ raising each diagonal entry to the $k$-th power.
**CAUTION.** You cannot simply raise the entries of a matrix $A$ to the $k$-th power to compute $A^{k}$! This shortcut is special to diagonal matrices!
Bottom-line: If a square matrix $A$ is diagonalizable, its power $A^{k}$ is easy to compute.
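As a quick numerical illustration of both the diagonal-power rule and the caution above, here is a small numpy check (the matrices are my own examples):

```python
import numpy as np

# Diagonal case: matrix power = entrywise power of the diagonal.
lam = np.array([2.0, 1.0, 0.5])
D = np.diag(lam)
print(np.allclose(np.linalg.matrix_power(D, 6), np.diag(lam**6)))   # True

# Non-diagonal case: entrywise powers do NOT give the matrix power.
A = np.array([[1.0, 1.0], [0.0, 1.0]])
print(np.allclose(np.linalg.matrix_power(A, 2), A**2))              # False: A@A = [[1,2],[0,1]]
```

Note that in numpy, `A**2` is the entrywise square, while `np.linalg.matrix_power(A, 2)` is the true matrix product `A @ A`.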
**Remark.** Some verbiage: we sometimes say "to _diagonalize_ a matrix" to mean to produce the factorization $PDP^{-1}$ for the matrix, with $P$ invertible and $D$ diagonal. We also refer to the factorization $PDP^{-1}$ as the _diagonalization_ of $A$.
## Eigenbasis characterization.
Let us analyze carefully what it means for an $n\times n$ square matrix to be diagonalizable column by column. If $A = PDP^{-1}$ for some invertible $P$ and diagonal $D$, let's say $$
P=\begin{pmatrix}| & | & & | \\
\vec v_{1} & \vec v_{2} & \cdots & \vec v_{n} \\
| & | & & |\end{pmatrix}, D = \begin{pmatrix}\lambda_{1} \\
& \lambda_{2} \\
& & \ddots \\
& & & \lambda_{n}\end{pmatrix}
$$ then working out $AP=PD$ gives $$
\begin{pmatrix}| & | & & | \\
A\vec v_{1} & A\vec v_{2} & \cdots & A\vec v_{n} \\
| & | & & |\end{pmatrix} = \begin{pmatrix}| & | & & | \\
\lambda_{1}\vec v_{1} & \lambda_{2}\vec v_{2} & \cdots & \lambda_{n}\vec v_{n} \\
| & | & & |\end{pmatrix}
$$as we saw before. Since $P$ is invertible, its columns $\vec v_{1},\vec v_{2},\ldots,\vec v_{n}$ form a _basis_. And since each $\vec v_{i}$ satisfies $A\vec v_{i} = \lambda_{i} \vec v_{i}$, we see that each of these basis vectors $\vec v_{i}$ is furthermore an _eigenvector_ of $A$. This set of vectors $\beta = \{\vec v_{1},\vec v_{2},\ldots,\vec v_{n}\}$ is called an **eigenbasis** to the matrix $A$: it is a basis whose vectors are all eigenvectors of $A$. So, this gives another characterization of diagonalizability,
> **Eigenbasis characterization.**
> Let $A$ be an $n\times n$ matrix over the scalars $\mathbb F$. Then $A$ is diagonalizable if and only if there exists an eigenbasis $\beta = \{\vec v_{1},\ldots,\vec v_{n}\}$ to $A$, that is, the $n$ vectors $\vec v_{1},\ldots,\vec v_{n}$ form a basis of $\mathbb{F}^{n}$ and each $\vec v_{i}$ is an eigenvector of $A$.
> Further, if $\beta =\{\vec v_{1},\ldots,\vec v_{n}\}$ is an eigenbasis to $A$, then $A$ is diagonalized as $PDP^{-1}$ where $$
P=\begin{pmatrix}| & | & & | \\
\vec v_{1} & \vec v_{2} & \cdots & \vec v_{n} \\
| & | & & |\end{pmatrix}, D = \begin{pmatrix}\lambda_{1} \\
& \lambda_{2} \\
& & \ddots \\
& & & \lambda_{n}\end{pmatrix}
$$with each $\lambda_{i}$ the corresponding eigenvalue such that $A\vec v_{i} = \lambda_{i} \vec v_{i}$.
**Example.**
Let us consider the $2\times 2$ matrix $A$ where $$
A = \begin{pmatrix}2 & 1 \\ 0 &1\end{pmatrix}.
$$Consider the basis set $\beta = \{\vec v_{1},\vec v_{2}\}$ for $\mathbb R^{2}$ where $$
\vec v_{1} = \begin{pmatrix}1\\0\end{pmatrix} \quad\text{and}\quad \vec v_{2}=\begin{pmatrix}1\\-1\end{pmatrix}
$$
Note that the basis vectors are in fact eigenvectors of $A$: $$
\begin{align*}
A\vec v_{1} = \begin{pmatrix}2 & 1\\0 & 1\end{pmatrix}\begin{pmatrix}1\\0\end{pmatrix} = \begin{pmatrix}2\\0\end{pmatrix} = 2\begin{pmatrix}1\\0\end{pmatrix}=2\vec v_{1} \\
A\vec v_{2} = \begin{pmatrix}2 & 1\\0 & 1\end{pmatrix}\begin{pmatrix}1\\-1\end{pmatrix} = \begin{pmatrix}1\\-1\end{pmatrix} = 1\begin{pmatrix}1\\-1\end{pmatrix}=1\vec v_{2}
\end{align*}
$$Hence $\beta$ is an eigenbasis to $A$, and $A$ is therefore diagonalizable! The _diagonalization_ of $A$ is given by $A = PDP^{-1}$ where $$
P = \begin{pmatrix}1 & 1 \\
0 & -1\end{pmatrix} \quad\text{and}\quad D = \begin{pmatrix}2 & 0 \\0 & 1\end{pmatrix}. \blacklozenge
$$Great, if we can find an eigenbasis to a square matrix $A$, then we have shown $A$ is diagonalizable, and can even write down its diagonalization $PDP^{-1}$.
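If numpy is available, the factorization from this example can be verified numerically, along with the fast-power trick from earlier; this is just a sanity-check sketch:

```python
import numpy as np

A = np.array([[2.0, 1.0], [0.0, 1.0]])
P = np.array([[1.0, 1.0], [0.0, -1.0]])   # eigenvectors v1, v2 as columns
D = np.diag([2.0, 1.0])

# Check the factorization A = P D P^{-1}
print(np.allclose(A, P @ D @ np.linalg.inv(P)))         # True

# Fast power: A^10 = P D^10 P^{-1}, with D^10 computed entrywise
A10 = P @ np.diag(np.array([2.0, 1.0])**10) @ np.linalg.inv(P)
print(np.allclose(A10, np.linalg.matrix_power(A, 10)))  # True
```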
But how do we find this eigenbasis, if it exists at all?
Also curiously, the matrix $A = \begin{pmatrix}2&1\\0&1\end{pmatrix}$ is diagonalizable, but $B = \begin{pmatrix}2&1\\0&2\end{pmatrix}$ turns out to be not diagonalizable! How do we know?
For this we need to develop how to find eigenvectors -- they live in eigenspaces.
## Interlude -- brief words about polynomials.
Let $p(t) =a_{n}t^{n}+a_{n-1}t^{n-1}+\cdots+a_{0}$ be a polynomial of degree $n$. Suppose $\lambda$ is a root of $p$ that is repeated a total of $m$ times, meaning $p(t)=q(t)(t-\lambda)^{m}$ for some polynomial $q$ with $q(\lambda)\neq 0$. Then we say this $m$ is the **algebraic multiplicity of $\lambda$**, and denote it as $\text{am}(\lambda)$.
For example, the polynomial $p(t)=4(2-t)^{5}(8+t)^{3}t^{7}$ has roots $2,-8,0$ where their algebraic multiplicities are $$
\begin{array}{}
\text{root } \lambda & 2 & -8 & 0 \\
\text{am}(\lambda) & 5 & 3 & 7
\end{array}
$$
If a polynomial $p(t)$ over $\mathbb F$ can be factored completely into a product of linear terms, like $$
p(t)=c (\lambda_{1}-t)^{m_{1}}(\lambda_{2}-t)^{m_{2}}\cdots(\lambda_{k}-t)^{m_{k}}
$$where $c,\lambda_{1},\ldots,\lambda_{k}$ are all scalars in $\mathbb F$, then we say the polynomial $p$ **splits** over $\mathbb F$.
There are polynomials over the reals that do not split over the reals; for example, $p(t)=t^{2}+1$ cannot be factored over the reals into a product of linear terms. However, over the complex numbers this is a _non-issue_: _every complex polynomial splits over $\mathbb C$_. This is because of:
> **Fundamental theorem of algebra.**
> A nonconstant polynomial $p(t)$ over $\mathbb C$ always has a root in $\mathbb C$. In particular, a degree $n$ polynomial $p(t)$ over $\mathbb C$ splits over $\mathbb C$ as a product of $n$ linear terms, where some terms may repeat.
**Example.**
The polynomial $p(t)=t^{2}+1$ does not split over $\mathbb R$, but splits over $\mathbb C$. Over the complex numbers we have $p(t) = (t-i)(t+i)$.
One thing to note is that if a degree $n$ polynomial $p$ splits, then summing over all the algebraic multiplicities across the distinct roots of $p$ gives $$
\sum_{\lambda} \text{am}(\lambda) =n=\deg(p),
$$
and in general $\sum_{\lambda}\text{am}(\lambda) \le n$.
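This multiplicity bookkeeping can be checked with a computer algebra system. For instance, sympy's `roots` returns each root together with its algebraic multiplicity; here is a sketch using the example polynomial from above:

```python
import sympy as sp

t = sp.symbols('t')
p = 4*(2 - t)**5*(8 + t)**3*t**7        # the example polynomial, degree 15
ams = sp.roots(p, t)                    # dict: {root: algebraic multiplicity}
print(ams)                              # root 2 has am 5, root -8 has am 3, root 0 has am 7

# p splits (it is already a product of linear terms), so the am's sum to deg p
assert sum(ams.values()) == sp.degree(p, t)
```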
## Eigenspaces.
Let us revisit how we find eigenvalues again, and why they come from roots of the characteristic polynomial.
Suppose $\lambda$ is an eigenvalue to some square matrix $A$, then there exists some nonzero vector $\vec v\neq 0$ such that $$
\begin{array}{}
& A\vec v = \lambda \vec v, & \vec v\neq 0 \\
\iff & (A -\lambda I)\vec v = \vec 0, & \vec v \neq 0 \\
\iff & \vec v\in \ker (A-\lambda I), & \vec v\neq 0 \\
\iff & A - \lambda I \text{ is not invertible} \\
\iff & \det(A - \lambda I) = 0
\end{array}
$$from which we see that $\lambda$ is a root of the characteristic polynomial $p_{A}(t) = \det(A-tI)$.
But this deduction reveals even more: it shows that any eigenvector $\vec v$ associated to the eigenvalue $\lambda$ lives in $\ker(A-\lambda I)$. So we denote $$
E_{\lambda} = \ker (A-\lambda I)
$$as the **eigenspace** of the eigenvalue $\lambda$, where every eigenvector with eigenvalue $\lambda$ is found in $E_{\lambda}$. In fact, every vector in $E_{\lambda}$ except the zero vector is an eigenvector of $A$ with eigenvalue $\lambda$, and these are _all_ of the eigenvectors with eigenvalue $\lambda$.
Also note that as $E_{\lambda}$ is the kernel of a matrix, it is a _linear space_, hence the terminology eigen_space_. And since it is a linear space, we can find a basis to $E_{\lambda}$ as well as its dimension, and we define the **geometric multiplicity of $\lambda$** to be the dimension of $E_{\lambda}$, with shorthand $\text{gm}(\lambda) = \dim(E_{\lambda})$.
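Since $E_{\lambda}$ is a kernel, a basis for it (and hence $\text{gm}(\lambda)$) can be computed exactly with sympy's `nullspace`; a sketch, using the matrix $B$ from earlier:

```python
import sympy as sp

B = sp.Matrix([[2, 1], [0, 2]])
lam = 2
E = (B - lam*sp.eye(2)).nullspace()   # basis of the eigenspace E_2 = ker(B - 2I)
print(E)                              # one basis vector: (1, 0)
gm = len(E)                           # geometric multiplicity = dim E_2
print(gm)                             # 1
```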
An important and useful fact,
> If $\lambda$ is an eigenvalue of $A$, then we always have $$
1 \le \text{gm}(\lambda) \le \text{am}(\lambda).
$$
(I will prove this separately in another note.)
Now, since to diagonalize a square matrix $A$ is to produce an eigenbasis to $A$, what we should do then is the following:
- First find all the eigenvalues $\lambda$ of $A$ by solving for the roots of the characteristic polynomial $p_{A}(t)=\det(A-t I)$.
- For each eigenvalue $\lambda$, compute the eigenspace $E_{\lambda}$ (that's where we can find the eigenvectors), and write down a _basis_ for each $E_{\lambda}$.
- See if we can put together an eigenbasis to $A$ by using the bases to each $E_{\lambda}$.
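These three steps are exactly what sympy's `eigenvects` and `diagonalize` automate; a sketch, using the matrix $A$ from the earlier example:

```python
import sympy as sp

A = sp.Matrix([[2, 1], [0, 1]])

# Steps 1 & 2: eigenvects returns (eigenvalue, am, basis of E_lambda) triples
for lam, am, basis in A.eigenvects():
    print(lam, am, [list(v) for v in basis])

# Step 3: diagonalize assembles P and D from the eigenspace bases
P, D = A.diagonalize()
assert A == P * D * P.inv()
```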
Let us see this in action.
**Example.**
Consider the matrices $$
A = \begin{pmatrix}2 & 1\\0 & 1\end{pmatrix} \quad\text{and}\quad B=\begin{pmatrix}2 & 1\\0 & 2\end{pmatrix}
$$Let us analyze the diagonalizability of the matrix $A$.
First its characteristic polynomial is $$
p_{A}(t) = \det(A-tI)=\det\begin{pmatrix}2-t & 1 \\ 0 & 1-t\end{pmatrix} = (2-t)(1-t)
$$so we see the roots of $p_{A}$ are $2,1$, which are the eigenvalues of $A$. Their algebraic multiplicities are $1$ each.
Now, for each eigenvalue $\lambda$, we will compute its eigenspace $E_{\lambda}$ and write down a basis for $E_{\lambda}$. Note $$
E_{2} = \ker (A-2I)=\ker \begin{pmatrix}0 & 1\\0 & -1\end{pmatrix}= \operatorname{span} \{\begin{pmatrix}1\\0\end{pmatrix}\}
$$and $$
E_{1} = \ker (A-I) = \ker \begin{pmatrix}1 & 1\\0 & 0\end{pmatrix} = \operatorname{span} \{\begin{pmatrix}-1\\1\end{pmatrix}\}.
$$So we can organize our information into this _eigen-table_ (this is a made up term): $$
\begin{array}{c|c|c}
\lambda & 2 & 1 \\\hline
\text{am}(\lambda) & 1 & 1 \\
\text{basis for } E_{\lambda} & \begin{pmatrix}1\\0\end{pmatrix} & \begin{pmatrix}-1 \\ 1\end{pmatrix} \\
\text{gm}(\lambda) & 1 & 1
\end{array}
$$
Now let's see, do we have enough linearly independent eigenvectors to form a basis for $\mathbb R^{2}$? Yes, we have two of them, so $$
\begin{pmatrix}1\\0\end{pmatrix},\begin{pmatrix}-1\\1\end{pmatrix}
$$form an eigenbasis to $A$. So $A$ is diagonalizable, and we can diagonalize $A$ as $PDP^{-1}$ where $$
P = \begin{pmatrix}1 & -1 \\0 & 1\end{pmatrix} \text{ and }D=\begin{pmatrix}2\\ & 1\end{pmatrix}.
$$
As for matrix $B$, note its characteristic polynomial is $$
p_{B}(t) = \det(B-tI)= \det \begin{pmatrix}2-t & 1\\0 & 2-t\end{pmatrix} = (2-t)^{2}.
$$So there is only one eigenvalue, $2$, with algebraic multiplicity $2$. The eigenspace $E_{2}$ is $$
E_{2} = \ker \begin{pmatrix}0 & 1\\0 & 0\end{pmatrix} = \operatorname{span} \{\begin{pmatrix}1 \\ 0\end{pmatrix}\}
$$So we have _eigen-table_ for $B$ as follows $$
\begin{array}{c|c}
\lambda & 2\\ \hline
\text{am}(\lambda) & 2 \\
\text{basis for }E_{\lambda} & \begin{pmatrix}1\\0\end{pmatrix} \\
\text{gm}(\lambda) & 1
\end{array}
$$
This is an issue: we don't have enough linearly independent eigenvectors to form a basis of $\mathbb R^{2}$! There are no more eigenspaces to use, and the one available eigenspace $E_{2}$ is not big enough. Therefore no eigenbasis to $B$ is possible, and $B$ is not diagonalizable!
By the way, in this case where the geometric multiplicity is not big enough, that is, $\text{gm}(2) < \text{am}(2)$, we say the eigenvalue $2$ is **defective** or **deficient**.
$\blacklozenge$
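For what it's worth, sympy reaches the same verdict on $B$, and its `eigenvects` output makes the defect visible (one eigenvalue with $\text{am}=2$ but only one basis vector):

```python
import sympy as sp

B = sp.Matrix([[2, 1], [0, 2]])
print(B.is_diagonalizable())   # False: gm(2) = 1 < am(2) = 2
# one (eigenvalue, am, basis) triple, with a single basis vector
print(B.eigenvects())          # [(2, 2, [Matrix([[1],[0]])])]
```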
This shows that to be diagonalizable, the geometric multiplicities cannot be deficient: they must all be big enough that they add up to $n$, where $A$ is $n\times n$, so that we can assemble a basis. Hence we have
> **Eigenspace characterization.**
> An $n\times n$ matrix $A$ is diagonalizable if and only if the geometric multiplicities across the distinct eigenvalues sum to $n$, that is, $$
\sum_{\lambda} \text{gm}(\lambda) = n.
$$
Note in general for any square matrix, $\sum_{\lambda}\text{gm}(\lambda) \le n$.
There is one more observation we can make. If the characteristic polynomial does not split, then the sum of the algebraic multiplicities will be strictly less than $n$. And since each geometric multiplicity is no greater than the corresponding algebraic multiplicity, we see that a non-splitting characteristic polynomial leads to non-diagonalizability. So, we can rephrase this characterization as
> **Multiplicities characterization.**
> An $n\times n$ matrix $A$ is diagonalizable over $\mathbb F$ if and only if both (i) the characteristic polynomial $p_{A}$ splits over $\mathbb F$, and (ii) for each eigenvalue $\lambda$, we have $\text{am}(\lambda) = \text{gm}(\lambda)$.
Again, if we are working over the complex numbers, splitting polynomials is not a problem as they will always split over $\mathbb C$.
Let us look at some examples.
**Example.**
Consider the rotation by $90^{\circ}$ CCW matrix $R$, where $$
R = \begin{pmatrix}0 & -1\\1 & 0\end{pmatrix}.
$$
Its characteristic polynomial is $$
p_{R}(t) = \det \begin{pmatrix}-t & -1\\1 & -t\end{pmatrix} = t^{2} +1.
$$
Note, $p_{R}$ does not split over the reals. So, $R$ is **not** diagonalizable over $\mathbb R$.
However, over the complex numbers, we can factor $$
p_{R}(t) = (t-i)(t+i)
$$so the eigenvalues of $R$ over $\mathbb C$ are $i$ and $-i$, each with algebraic multiplicity $1$. Since we always have $1 \le\text{gm}(\lambda)\le\text{am}(\lambda)$, we can fill in the _eigen-table_ at least partially $$
\begin{array}{}
\lambda & i & -i \\
\text{am}(\lambda) & 1 & 1 \\
\text{basis for }E_{\lambda} & ? & ? \\
\text{gm}(\lambda) & 1 & 1
\end{array}
$$as the geometric multiplicities are forced: since $1 \le\text{gm}(\pm i)\le\text{am}(\pm i)=1$, we must have $\text{gm}(\pm i)=1$.
So, $p_{R}$ splits over $\mathbb C$ and $\text{am}(\lambda) =\text{gm}(\lambda)$ for each eigenvalue (no defective eigenvalues), hence $R$ is diagonalizable over $\mathbb C$! Alternatively, we also have $\sum_{\lambda}\text{gm}(\lambda) =2$, which leads to the same conclusion.
If we further want the diagonalization of $R$ over $\mathbb C$, then we need to compute basis sets to the eigenspaces. Note $$
E_{i} = \ker \begin{pmatrix}-i & -1\\1 & -i\end{pmatrix} = \ker \begin{pmatrix}1 & -i\\0 & 0\end{pmatrix} = \operatorname{span} \{\begin{pmatrix}i\\1\end{pmatrix}\}
$$and $$
E_{-i} = \ker \begin{pmatrix}i & -1\\1 & i\end{pmatrix} = \ker \begin{pmatrix}1 & i\\0 & 0\end{pmatrix} = \operatorname{span} \{\begin{pmatrix}-i \\1\end{pmatrix}\}
$$So we have diagonalization $R=PDP^{-1}$ with $$
P = \begin{pmatrix}i & -i \\ 1 & 1\end{pmatrix} \text{ and } D=\begin{pmatrix}i\\ & -i\end{pmatrix}. \quad\blacklozenge
$$
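Numerically, numpy's `eig` happily works over $\mathbb C$ and recovers this diagonalization, up to scaling of the eigenvectors and ordering of the eigenvalues (numpy normalizes eigenvectors, so its $P$ will differ from the hand-computed one, but the factorization still holds):

```python
import numpy as np

R = np.array([[0.0, -1.0], [1.0, 0.0]])
lam, P = np.linalg.eig(R)     # complex eigenvalues and eigenvector matrix
print(lam)                    # eigenvalues are +/- i (order may vary)
D = np.diag(lam)
print(np.allclose(R, P @ D @ np.linalg.inv(P)))   # True
```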
**Example.**
Consider the matrix $A$ where $$
A = \begin{pmatrix}0 & 1 & 1\\2 & 1 & 2\\3 & 3 & 2\end{pmatrix}.
$$You are given that $A$ has eigenvalues $-1$ and $5$. Determine if $A$ is diagonalizable, and if so, diagonalize it.
$\blacktriangleright$ Let us try to fill in this _eigen-table_ $$
\begin{array}{}
\lambda & -1 & 5 \\
\text{am}(\lambda) & ? & ? \\
\text{basis for } E_{\lambda} & ? & ? \\
\text{gm}(\lambda) & ? & ?
\end{array}
$$
We could use long division to figure out the algebraic multiplicities of the given eigenvalues, and to see if there are any missing eigenvalues. But since we have to compute $E_{\lambda}$ anyway, we start with that. Note $$
E_{-1} = \ker \begin{pmatrix}1 & 1 & 1\\2 & 2 & 2\\3 & 3 & 3\end{pmatrix} = \ker \begin{pmatrix}1 & 1 & 1\\0 & 0 & 0\\0 & 0 & 0\end{pmatrix} = \operatorname{span} \{\begin{pmatrix}-1\\1\\0\end{pmatrix},\begin{pmatrix}-1\\0\\1\end{pmatrix}\}.
$$This shows $\text{gm}({-1})=2$. But what about $\text{gm}(5)$? We know $\text{gm}(5) \ge 1$ and that $\text{gm}(-1) +\text{gm}(5) \le 3$, as $A$ is a $3\times 3$ matrix. So we are forced to have $\text{gm}(5)=1$, and the sum $$
\text{gm}(-1) + \text{gm}(5) = 3.
$$Hence we conclude that $A$ is diagonalizable!
Since we also want the diagonalization, we continue to find $E_{5}$. We have $$
E_{5} = \ker \begin{pmatrix}-5 & 1 & 1 \\
2 & -4 & 2\\3 & 3 & -3\end{pmatrix} = \ker \begin{pmatrix}1 & 0 & -\frac{1}{3} \\
0 & 1 & -\frac{2}{3} \\
0 & 0 & 0\end{pmatrix} = \operatorname{span}\left\{ \begin{pmatrix} \frac{1}{3} \\ \frac{2}{3}\\1\end{pmatrix} \right\}.
$$So we can fill in our _eigen-table_ $$
\begin{array}{}
\lambda & -1 & 5 \\
\text{am}(\lambda) & (2) & (1) \\
\text{basis for } E_{\lambda} & \begin{pmatrix}-1\\1\\0\end{pmatrix},\begin{pmatrix}-1\\0\\1\end{pmatrix} & \begin{pmatrix} \frac{1}{3} \\ \frac{2}{3}\\1\end{pmatrix} \\
\text{gm}(\lambda) & 2 & 1
\end{array}
$$which gives the diagonalization $A = PDP^{-1}$ with $$
P = \begin{pmatrix}-1 & -1 & \frac{1}{3} \\
1 & 0 & \frac{2}{3} \\
0 & 1 & 1\end{pmatrix} \text{ and } D = \begin{pmatrix}-1 \\
& -1 \\
& & 5\end{pmatrix}. \quad\blacklozenge
$$In the next set of notes, we will see some applications and miscellaneous useful facts about diagonalization, eigenvalues, and the characteristic polynomial.
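As a closing sanity check, the diagonalization from this last example can be verified exactly with sympy (using `Rational` to keep the fractions exact):

```python
import sympy as sp

A = sp.Matrix([[0, 1, 1], [2, 1, 2], [3, 3, 2]])
P = sp.Matrix([[-1, -1, sp.Rational(1, 3)],
               [ 1,  0, sp.Rational(2, 3)],
               [ 0,  1, 1]])
D = sp.diag(-1, -1, 5)
print(A == P * D * P.inv())   # True: the factorization checks out exactly
```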